IAES International Journal of Artificial Intelligence (IJ-AI) 
Vol. 11, No. 4, December 2022, pp. 1507~1516 
ISSN: 2252-8938, DOI: 10.11591/ijai.v11.i4.pp1507-1516


Identification of polar liquids using support vector machine 
based classification model 


Thushara Haridas Prasanna¹, Mridula Shantha¹, Anju Pradeep¹, Pezholil Mohanan²
¹Division of Electronics, School of Engineering, Cochin University of Science and Technology, Kerala, India
²Advanced Centre for Atmospheric Radar Research, Cochin University of Science and Technology, Cochin, Kerala, India


Article Info

Article history:
Received Dec 29, 2021
Revised May 20, 2022
Accepted Jun 18, 2022

Keywords:
Machine learning
Polar liquids
Support vector machine

ABSTRACT

The dispersive nature of polar liquids creates ambiguity in their identification process. Comparing
the measured values with the available standard values to identify an unknown liquid takes a long
time and considerable effort. Machine learning techniques are now widely used to assist the
measurement techniques and make predictions with great accuracy and less human effort. This paper
proposes a support vector machine (SVM) based classification model for the identification of six
polar liquids: butan-1-[...] complex permittivity is less than ±6% of the standard value in NPL report,
the proposed model identifies the liquids with 100% accuracy in the entire temperature and frequency
range. The performance of the model is validated by testing the model with data external to the
dataset used. The findings show that the proposed model is a useful and efficient tool for
identifying unknown polar liquids.


1. INTRODUCTION

Material characterization plays a major role in many applications such as material processing,
bioengineering, medical treatment and the food industry. Materials can be characterized based on their
electromagnetic properties such as permittivity, permeability and conductivity. Measurements in the microwave
frequency range focus on the complex permittivity (ε*) rather than the permeability and conductivity.
Complex permittivity gives insight into the structure of the material, the temperature of its surroundings and
the amount of impurities in it. The real part of the complex permittivity reflects the dielectric medium's
ability to store energy, whereas the imaginary part describes the medium's energy losses. Dispersion is the
variation of ε* with frequency. At microwave frequencies, orientation polarization is
responsible for dispersion [1]. For dispersive materials, repeated measurements at different
temperatures and frequencies are required to study the dielectric dispersion characteristics [2]. This work
looks at polar liquids, which are dispersive in nature. These liquids are used in specific absorption rate (SAR)
metrology because their complex permittivity is comparable to that of biological tissue [2], [3].
Polar liquids in their pure form can also be employed as calibration materials in the field of dielectric
instrumentation. The National Physical Laboratory (NPL) of the UK has conducted a detailed study of the dispersion
characteristics of commonly used polar liquids using the coaxial cell permittivity measurement technique.
NPL report MAT 23 describes the ε* of six polar liquids (butan-1-ol, dimethyl sulphoxide (DMSO), ethanediol,
ethanol, methanol and propan-1-ol) for a frequency range of 0.1 GHz-5 GHz and a temperature range of
10 °C-50 °C [2]. The dispersive nature of these liquids makes the identification process a complex task.

According to related studies, an unknown liquid can be identified by measuring properties such as
density, melting point, boiling point and solubility, and then comparing the results with the values of known liquids.
This requires many experimental procedures and is very time consuming. Nuclear magnetic resonance (NMR)
spectroscopy followed by the Fourier transform method has been used to identify unknown
alcohols [4]. NMR spectroscopy is a non-destructive technique, but it is very expensive. Another method
identifies liquids using a surface acoustic mode aluminum nitride (AlN) transducer [5]. This method is
well suited to testing small amounts of liquid, but the identification process is time consuming. Complex
permittivity measurement is a powerful tool for the identification of liquids. The most popular measurement
techniques are the transmission and reflection line, open ended probe and resonant methods [6]. In the
transmission and reflection line method, ε* is extracted from the measured transmission and reflection
coefficients with the help of the Nicholson-Ross-Weir method. The open ended coaxial probe method uses a
rational function model to extract ε*. In the resonant method, the measured quality factor and the shift in resonant
frequency are used to extract ε*. In all these cases,
identifying an unknown polar liquid requires the measured value of complex permittivity followed by a
manual search of the available standard report for a close match. This takes a lot of time, especially for
polar liquids, because of their dispersive nature.

A support vector machine (SVM) based classification model is proposed in this paper for the
identification of polar liquids. The complex permittivity (ε′ and ε″) of the liquid and the frequency (f) and
temperature (t) of the measurement system, which are relevant to all permittivity measurement techniques,
are used as the input features to identify the polar liquids. The rest of the paper is structured as follows:
research method, results and discussion, graphical user interface, and conclusion.


2. RESEARCH METHOD 

Machine learning (ML) algorithms have the ability to recognize patterns in data and separate the data into
categories. This process is known as classification. It can be used to identify the group membership of
new data instances. The workflow of the classification based ML model used in this work is shown in
Figure 1. The data obtained from the NPL report is used to develop the model. The parameters ε′, ε″, f and t,
which are relevant to any permittivity measurement system, are taken as the input features. The model is trained
using the training set and the hyperparameters are tuned to get the best accuracy. The performance of different ML
techniques is evaluated using performance measures and the most suitable one is selected. Complex
permittivity can be measured using several techniques, of which the most commonly used are the coaxial
probe, coaxial cell and planar sensors. Measurements using the coaxial probe and coaxial cell are well suited to a
wide range of frequencies, whereas planar sensors have a limited frequency of operation. Since the NPL report
uses coaxial cell based complex permittivity measurement, the ability of the model to identify the polar
liquids from data obtained with other measurement techniques is also verified in this work.

Figure 1. Workflow of the proposed methodology


2.1. Data gathering and pre-processing

The data for this work was gathered from NPL report MAT 23. Table 1 shows the dataset
description. The range of permittivity of the six polar liquids for a temperature range of 10 °C-50 °C and
frequency range of 0.1 GHz-5 GHz is shown in Table 2 [2]. This is a multi-class classification problem
in which the six polar liquids are the six classes. The classes are assigned the values
1 to 6. To make better use of the dataset of 1,000 samples, K-fold cross validation is performed on the whole dataset.
This splits the dataset into K subsets and, in turn, tests on each subset while training on the remaining ones.


Table 1. Description of the dataset

Inputs
No  Feature                        Values
1   Frequency (f) in GHz           0.1-5
2   Temperature (T) in °C          10-50
3   Permittivity (real part)       3.23-78.5
4   Permittivity (imaginary part)  0.3-18.56

Output
No  Liquid               Number of samples  Class
1   Butan-1-ol           140                1
2   Dimethyl sulphoxide  140                2
3   Ethanediol           180                3
4   Ethanol              180                4
5   Methanol             180                5
6   Propan-1-ol          180                6


Table 2. Range of permittivity

Polar liquid         Permittivity (real part)  Permittivity (imaginary part)
Butan-1-ol           3.23-16.27                0.84-7.9
Dimethyl sulphoxide  34.85-47.12               0.3-18.56
Ethanediol           6.93-43.48                0.94-18.3
Ethanol              4.93-26.18                0.82-11.13
Methanol             10.9-35.68                0.44-14.89
Propan-1-ol          3.66-20.35                1.1-9.35
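
The overlap between the ranges in Table 2 can be illustrated with a short sketch. It is not part of the original work: the helper name `candidate_liquids` and the sample points are hypothetical, and only the range values are taken from Table 2. The function lists every liquid whose real and imaginary permittivity ranges both contain a given value, which suggests why a simple range lookup is not enough to identify a liquid.

```python
# Permittivity ranges from Table 2 (10-50 °C, 0.1-5 GHz):
# liquid -> ((real_min, real_max), (imag_min, imag_max))
TABLE_2_RANGES = {
    "Butan-1-ol": ((3.23, 16.27), (0.84, 7.9)),
    "Dimethyl sulphoxide": ((34.85, 47.12), (0.3, 18.56)),
    "Ethanediol": ((6.93, 43.48), (0.94, 18.3)),
    "Ethanol": ((4.93, 26.18), (0.82, 11.13)),
    "Methanol": ((10.9, 35.68), (0.44, 14.89)),
    "Propan-1-ol": ((3.66, 20.35), (1.1, 9.35)),
}


def candidate_liquids(eps_real, eps_imag):
    """Return the liquids whose Table 2 ranges contain the given permittivity."""
    matches = []
    for liquid, ((r_lo, r_hi), (i_lo, i_hi)) in TABLE_2_RANGES.items():
        if r_lo <= eps_real <= r_hi and i_lo <= eps_imag <= i_hi:
            matches.append(liquid)
    return matches


# Hypothetical measurement that falls inside the ranges of five liquids.
print(candidate_liquids(12.0, 5.0))
# Hypothetical measurement that only the DMSO range contains.
print(candidate_liquids(45.0, 10.0))
```

A value such as (12.0, 5.0) is compatible with every liquid except dimethyl sulphoxide, so frequency and temperature are needed as additional features to separate the classes.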


2.2. Support vector machine classifier 

The purpose of the SVM classifier is to find the optimum decision boundary that separates the
n-dimensional space into classes and accurately classifies a new data point. For multi-class classification, the
one-against-all technique is used, in which SVM creates one model for each class. With $S$ classes, the $m$-th SVM is
trained with all of the data in the $m$-th class having positive labels and all other samples having negative labels
[7], [8]. Thus, given $l$ training data $(x_1, y_1), \ldots, (x_l, y_l)$, where $x_i \in R^n$, $i = 1, \ldots, l$, and
$y_i \in \{1, \ldots, S\}$ is the class of $x_i$, the $m$-th SVM solves the following minimization problem:

$$\min_{w^m,\, b^m,\, \xi^m} \ \frac{1}{2}(w^m)^T w^m + C \sum_{i=1}^{l} \xi_i^m \tag{1}$$

such that

$$(w^m)^T \phi(x_i) + b^m \ge 1 - \xi_i^m, \quad \text{if } y_i = m$$

$$(w^m)^T \phi(x_i) + b^m \le -1 + \xi_i^m, \quad \text{if } y_i \ne m$$

$$\xi_i^m \ge 0, \quad i = 1, 2, \ldots, l$$

The training data $x_i$ are mapped to a higher dimensional space by the function $\phi$ associated with the
kernel. Here $w$ is the weight vector, $b$ is the bias term, $C$ is the penalty parameter and $\xi$ is the slack
variable [9]. By minimizing $\frac{1}{2}(w^m)^T w^m$, SVM tries to maximize the margin $2/\|w^m\|$ between the data in
different classes. The penalty term $C \sum_{i=1}^{l} \xi_i^m$ tries to reduce the number of training errors. For (1)
there are $S$ decision functions

$$(w^1)^T \phi(x) + b^1, \ \ldots, \ (w^S)^T \phi(x) + b^S$$

and $x$ is assigned to the class that has the largest value of the decision function:

$$\text{Class of } x = \arg\max_{m = 1, 2, \ldots, S} \left( (w^m)^T \phi(x) + b^m \right)$$
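
The one-against-all decision rule can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in this work. It assumes the standard kernel expansion of the decision function, in which $(w^m)^T \phi(x)$ is written as a weighted sum of RBF kernel values over the support vectors; the function names, the value of `gamma` and the toy models are hypothetical and are not trained on the NPL data.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_value(x, model, gamma):
    """Kernel form of (w^m)^T phi(x) + b^m for one class model.

    model = (support_vectors, coefficients, bias); each coefficient is the
    signed weight of the matching support vector.
    """
    support_vectors, coefficients, bias = model
    return bias + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, coefficients)
    )


def classify(x, models, gamma=1.0):
    """Return the class m with the largest decision value (the argmax rule)."""
    values = {m: decision_value(x, model, gamma) for m, model in models.items()}
    return max(values, key=values.get)


# Hand-picked toy models for S = 3 classes (hypothetical, not trained).
toy_models = {
    1: ([(0.0, 0.0)], [1.0], -0.5),
    2: ([(3.0, 0.0)], [1.0], -0.5),
    3: ([(0.0, 3.0)], [1.0], -0.5),
}
print(classify((0.2, 0.1), toy_models))
```

Each class model produces one decision value, and the point is assigned to the class whose value is largest, exactly as in the argmax expression above.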




In this work, the number of classes is S=6, one for each polar liquid. K-fold cross validation is
applied to the dataset and the hyperparameters are tuned to get the best performance from the model [8], [10]-[13].
For reliable estimates, 5-fold cross validation is selected; the hyperparameter C is tuned to
the value 21 and the radial basis function (RBF) is used as the kernel [14], [15]. In each fold, the 5-fold cross
validation on the 1,000 sample dataset creates a training set of 800 samples and a test set of 200 samples.
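
The fold sizes quoted above follow directly from the way K-fold cross validation partitions the data. The sketch below is a minimal, dependency-free illustration; the function name `k_fold_indices` is hypothetical, and the sketch uses contiguous folds without the shuffling or stratification that a practical implementation would normally apply.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of n_folds contiguous folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds take one additional sample when the split is uneven.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# 1,000 samples and 5 folds, as in this work.
fold_sizes = [(len(train), len(test)) for train, test in k_fold_indices(1000, 5)]
print(fold_sizes)
```

Every fold yields 800 training samples and 200 test samples, and each sample appears in exactly one test set.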


3. RESULTS AND DISCUSSION 

The complex permittivity of polar liquids varies non-linearly with frequency and temperature.
Several classifiers are applied to the dataset to learn this complex relationship. The performance
of the classifiers is evaluated using performance measures (accuracy, error, specificity and sensitivity)
and is presented in Table 3. It is observed that the SVM with RBF kernel and tuned
penalty parameter C=21 separates the six polar liquids with 100% accuracy. The suitability of SVM is
thus confirmed. The receiver operating characteristic (ROC) curve is shown in Figure 2. The ideal point on
the figure is the top left corner, where the false positive rate is 0 and the true positive rate is 1. The area
under the curve for each class is obtained as 1 [16], [17]. This shows that all six polar liquids are
classified with an accuracy of 1.


Table 3. Performance comparison

Classifier     Accuracy (%)  Error (%)  Specificity (%)  Sensitivity (%)
Naive Bayes    59            41         91               59
Decision Tree  87            13         96               86
KNN            95            5          99               94
Random Forest  93            7          93               98
SVM            100           0          100              100
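
The performance measures in Table 3 can all be derived from a multi-class confusion matrix. The sketch below shows one common way to do this. It assumes that sensitivity and specificity are computed per class in a one-against-all manner and then averaged over the classes (macro averaging); the averaging scheme used for Table 3 is not stated, so this is an assumption. The function name and the example matrix are hypothetical.

```python
def classification_metrics(confusion):
    """Accuracy, error, macro sensitivity and macro specificity, all in percent.

    confusion[i][j] counts the samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    sensitivities, specificities = [], []
    for k in range(n_classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))
    accuracy = 100.0 * correct / total
    return {
        "accuracy": accuracy,
        "error": 100.0 - accuracy,
        "sensitivity": 100.0 * sum(sensitivities) / n_classes,
        "specificity": 100.0 * sum(specificities) / n_classes,
    }


# Hypothetical 3-class confusion matrix, for illustration only.
example = [
    [9, 1, 0],
    [0, 10, 0],
    [0, 2, 8],
]
print(classification_metrics(example))
```

A purely diagonal confusion matrix gives 100% accuracy, sensitivity and specificity and 0% error, which is the situation reported for the SVM.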

Figure 2. ROC for the proposed model


The accuracy of the proposed SVM model under 5-fold cross validation is 100% in all
5 folds. This confirms the stability of the proposed model. Overfitting is checked using three different
methods. Firstly, the confusion matrices of the training set and the test set of the SVM classification are shown
in Figure 3(a) and Figure 3(b) respectively. The diagonal elements indicate the correct predictions [16]-[18].
The SVM classifier performs well on both the training set and the test set, which indicates that the model is not
overfitted. Secondly, the accuracy of the model is plotted for the training and test sets while varying the penalty
parameter C, as shown in Figure 4. The model performs well on both the training set and the test set,
and the accuracy is 100% when C=21. This also indicates that the model is not overfitted. Finally, the support
vectors for each of the six classes are identified and listed in Table 4. The number of support
vectors in each class is substantially smaller than the number of samples in that class [11]. This further indicates
that the model is not overfitted. All code is written in Python 3.5 with the
Scikit-learn library [10]. The average training time is 0.02 s and the average time taken to classify a
new sample is 0.002 s on a computer with an Intel Core i5 processor running at a clock speed of 1.6 GHz and
equipped with 8 GB of RAM.

Figure 3. Confusion matrix (a) Training set, and (b) Test set

Figure 4. Accuracy vs penalty parameter (C)


Table 4. Support vectors for the proposed model

Class  Polar liquid  Number of samples in the class  Number of support vectors
1      Butan-1-ol    140                             29
2      DMSO          140                             17
3      Ethanediol    180                             44
4      Ethanol       180                             42
5      Methanol      180                             32
6      Propan-1-ol   180                             42
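
The comparison between support vectors and samples in Table 4 can be expressed as a percentage per class. The following sketch only restates the arithmetic on the values from Table 4; the helper name `support_vector_share` is hypothetical.

```python
# (number of samples in the class, number of support vectors) from Table 4
TABLE_4 = {
    "Butan-1-ol": (140, 29),
    "DMSO": (140, 17),
    "Ethanediol": (180, 44),
    "Ethanol": (180, 42),
    "Methanol": (180, 32),
    "Propan-1-ol": (180, 42),
}


def support_vector_share(table):
    """Percentage of each class's samples that are support vectors."""
    return {liquid: 100.0 * n_sv / n for liquid, (n, n_sv) in table.items()}


shares = support_vector_share(TABLE_4)
total_samples = sum(n for n, _ in TABLE_4.values())
total_sv = sum(n_sv for _, n_sv in TABLE_4.values())

for liquid, share in shares.items():
    print(f"{liquid}: {share:.1f}%")
print(f"Overall: {100.0 * total_sv / total_samples:.1f}%")
```

In every class, fewer than a quarter of the samples act as support vectors, and 206 of the 1,000 samples are support vectors overall.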


Alcohols like methanol and ethanol are highly volatile and evaporate rapidly. This changes the
liquid temperature and hence the permittivity. Polar liquids like DMSO and ethanediol are hygroscopic and
absorb water from the atmosphere, leading to variation in permittivity. The values of complex permittivity obtained
from the NPL report are based on the coaxial cell measurement technique, but complex permittivity can be
measured using other techniques. The robustness of the proposed model therefore needs to be tested on
data obtained using other measurement techniques as well. The most widely used permittivity measurement
techniques for liquids, namely the coaxial probe, planar sensors and the transverse electromagnetic (TEM) cell, are
considered for this purpose. The measured values of ε* differ from the standard values in the
NPL report, and the variation depends on the measurement technique, temperature, frequency and the liquid
used [19]-[29]. The measurement errors in ε* associated with different measurement techniques are
calculated for all six liquids and shown in Table 5. This variation is expressed as the maximum
percentage error (E) in the measured value with respect to the standard value. The measurement error is found
to be lowest for the TEM cell because of its closed structure. A new test set of 40 samples is formed from
these measured values and the response of the proposed model is noted. While preparing the new test set,
focus is given to those values that deviate considerably from the standard value. The proposed model
predicts all the liquids with 100% accuracy. The confusion matrix for the new test set is shown in Figure 5.


Table 5. Robustness of the proposed model

Liquid [Ref. No]  Measurement  T (°C)  f (GHz)  Maximum error in permittivity E (%)           Number of samples  Number of samples
                  technique                     Real part (ε′)  Imaginary part (ε″)  f (GHz)  supplied to model  identified
Butan-1-ol [20]   Probe        30      0.1-1    -16.72          -0.62                0.1      10                 10
DMSO [22]         Sensor       25      1.58     2.88            -37.6                1.58     1                  1
Ethanediol [23]   TEM cell     24.2    0.1-4    0.07            0.3                  1        6                  6
Ethanol [24]      Probe        25      1-5      10.15           8.04                 5        10                 10
Methanol [24]     Probe        25      1-5      10.49           -0.75                1.5      9                  9
Propan-1-ol [25]  Probe        30      0.1-1    -3.68           4.55                 0.4      4                  4
Total                                                                                         40                 40
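
The percentage error E reported in Table 5 can be computed as sketched below. The sign convention (measured value minus standard value, relative to the standard value) is an assumption inferred from the signed entries in the table, and the function names and readings are hypothetical; none of the numbers are taken from the cited references.

```python
def percent_error(measured, standard):
    """Signed error E (%) of a measured value relative to the standard value."""
    return 100.0 * (measured - standard) / standard


def max_percent_error(measured_values, standard_values):
    """Signed error with the largest magnitude over paired readings."""
    errors = [percent_error(m, s) for m, s in zip(measured_values, standard_values)]
    return max(errors, key=abs)


# Hypothetical readings of the real part of permittivity at three frequencies.
standard = [20.0, 18.0, 15.0]
measured = [18.0, 18.9, 15.3]
print(max_percent_error(measured, standard))
```

With these readings, the largest deviation is at the first frequency, where the measured value lies 10% below the standard value.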

Figure 5. Confusion matrix for the new test set


The permittivity of the six liquids in the frequency range of 0.1 GHz to 5 GHz at a room temperature of
25 °C is plotted in Figure 6. The real part of permittivity is shown in Figure 6(a), whereas the imaginary part is
shown in Figure 6(b). While the real part of permittivity (ε′) decreases with increasing frequency, the imaginary part
(ε″) increases with frequency, reaches a peak at a frequency known as the relaxation frequency (f_r) and then
decreases. In the case of DMSO, f_r is 8.32 GHz, which is above the frequency range shown in Figure 6.
Since the identification of these liquids is based on the measured value of complex permittivity (ε′ and ε″),
an error in measurement can lead to misclassification. For temperatures in the range of 10 °C to 50 °C, a
modified test set is created by manually incorporating the measurement error that might occur over the entire
frequency range (0.1 GHz to 5 GHz) during permittivity measurement, in steps of ±0.5% of the
standard value of both ε′ and ε″. The performance of the model at each step is evaluated and the maximum
measurement error in ε* that the model can accommodate without any misclassification is determined. For a
particular temperature, this test set consists of 120 samples, with 20 samples for each liquid. It is observed
that when the measurement error introduced in both ε′ and ε″ is equal to 6% of the standard value, the first
misclassification occurs for butan-1-ol at 10 °C and 0.1 GHz, as seen in row 1 of Table 6, and the
accuracy of the model drops to 99%. For all other temperatures and frequencies, the measurement error that
the model can accommodate without misclassification is greater than 6%. The details of the first
misclassification observed at different temperatures are shown in Table 6. Since most measurements
are made at a room temperature of 25 °C, the misclassification observed at this temperature is also
presented in the 25 °C row of Table 6. At 25 °C, the first misclassification is observed at 0.1 GHz when the
measured values of both ε′ and ε″ decrease by 7.5% of the standard value. The complex permittivity plot at
25 °C and 0.1 GHz with a measurement error of -7.5% incorporated is shown in Figure 7 and the corresponding
confusion matrix is shown in Figure 8. These results indicate that, if an unknown liquid is to be
identified using the proposed model, a well calibrated complex permittivity measurement system with a
measurement error of less than ±6% should be used to eliminate the chance of misclassification.
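
The misclassification test can be outlined as a simple sweep over the introduced error. The sketch below is an illustration under stated assumptions and not the code used in this work: the function `first_misclassification`, the stand-in classifier `toy_predict` and the sample values are hypothetical. The same percentage error is applied to both ε′ and ε″ in steps of 0.5%, with both signs, until a sample is misclassified.

```python
def first_misclassification(predict, samples, step=0.5, max_error=20.0):
    """Smallest error (%) in eps' and eps'' that causes a misclassification.

    predict: callable mapping (eps_real, eps_imag, f_ghz, temp_c) to a label.
    samples: list of ((eps_real, eps_imag, f_ghz, temp_c), true_label).
    Returns (error_percent, sample, predicted_label), or None if no
    misclassification occurs up to max_error.
    """
    n_steps = int(round(max_error / step))
    for k in range(1, n_steps + 1):
        for sign in (+1, -1):
            error = sign * k * step
            scale = 1.0 + error / 100.0
            for (eps_r, eps_i, f_ghz, temp_c), label in samples:
                predicted = predict((eps_r * scale, eps_i * scale, f_ghz, temp_c))
                if predicted != label:
                    return error, ((eps_r, eps_i, f_ghz, temp_c), label), predicted
    return None


# Stand-in classifier with a single threshold on eps' (hypothetical; in this
# work the trained SVM model takes this role).
def toy_predict(features):
    return "A" if features[0] < 10.52 else "B"


toy_samples = [
    ((10.0, 2.0, 0.1, 25.0), "A"),
    ((12.0, 3.0, 0.1, 25.0), "B"),
]
print(first_misclassification(toy_predict, toy_samples))
```

For the toy data, the sweep stops at an error of +5.5%, where the sample of class A crosses the threshold and is predicted as class B. Replacing `toy_predict` with the prediction function of a trained model gives the kind of test summarized in Table 6.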

Figure 6. Permittivity plot at 25 °C (a) Real part, and (b) Imaginary part


Table 6. Misclassification test of the proposed model

        Misclassification first observed
T (°C)  Error introduced in ε* (%)  Frequency (GHz)  Liquid       Misclassified result  Accuracy of the proposed model (%)
10      +6                          0.1              Butan-1-ol   Propan-1-ol           99
15      -9                          0.2              Ethanol      Propan-1-ol           98
15      -9                          0.1              Propan-1-ol  Butan-1-ol            98
20      -8                          0.1              Propan-1-ol  Butan-1-ol            99
25      -7.5                        0.1              Propan-1-ol  Butan-1-ol            99
30      +10                         0.1              Butan-1-ol   Propan-1-ol           99
35      -8                          0.1              Butan-1-ol   Propan-1-ol           99
40      -8.5                        0.2              Propan-1-ol  Butan-1-ol            99
45      -8                          0.1              Propan-1-ol  Butan-1-ol            99
50      -8                          0.1              Propan-1-ol  Butan-1-ol            99
10-50   Less than ±6                -                -            -                     100

Figure 7. Permittivity plot at 25 °C and 0.1 GHz with a measurement error of -7.5%

Figure 8. Confusion matrix for the modified test set at 25 °C and 0.1 GHz with a measurement error of -7.5%


4. GRAPHICAL USER INTERFACE 

The proposed method is intended as a support tool to assist the measurement techniques.
A graphical user interface (GUI) is designed for testing unknown liquids. The inputs to the interface are the
measured complex permittivity of the unknown liquid and the temperature and frequency of the
measurement system. The model identifies the unknown liquid within a fraction of a second. The front end of
the GUI is written in hypertext markup language (HTML) and styled using
cascading style sheets (CSS). The back end consists of the machine learning model and the web framework,
both written in Python. The Flask framework is used for the development of the interface [30]. It is a micro
web framework; here it loads the machine learning model, takes the input from the front end and returns the
predicted result. The GUI can be used for the temperature range of 10 °C to 50 °C and the frequency range of
0.1 GHz to 5 GHz. A warning message is displayed in the front end if the inputs fall outside this range or if the
measured permittivity is not within the range of the dataset. The appearance of the GUI and an example of the
predicted result are shown in Figure 9.
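
The range check behind the warning message can be sketched as follows. This is a minimal sketch rather than the code of the actual interface: the function `validate_inputs`, the field names and the wording of the warnings are hypothetical, while the limits are taken from Table 1. In a Flask application, a check of this kind would typically run in the request handler before the model is called.

```python
# Limits taken from Table 1 (dataset description).
LIMITS = {
    "temperature_c": (10.0, 50.0),
    "frequency_ghz": (0.1, 5.0),
    "eps_real": (3.23, 78.5),
    "eps_imag": (0.3, 18.56),
}


def validate_inputs(temperature_c, frequency_ghz, eps_real, eps_imag):
    """Return a list of warning strings; an empty list means all inputs are in range."""
    values = {
        "temperature_c": temperature_c,
        "frequency_ghz": frequency_ghz,
        "eps_real": eps_real,
        "eps_imag": eps_imag,
    }
    warnings = []
    for name, value in values.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} is outside the range {low}-{high}")
    return warnings


print(validate_inputs(25.0, 1.0, 12.0, 5.0))  # all inputs in range
print(validate_inputs(60.0, 1.0, 12.0, 5.0))  # temperature above 50 °C
```

Returning a list of messages instead of raising an error lets the front end display every out-of-range input at once.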


[Figure 9 screenshot: "Identify the liquid" form with Temperature, Permittivity (two fields) and Frequency inputs, a Predict button, and the example result "Liquid is Butanol"]


Figure 9. Graphical user interface 


5. CONCLUSION 

Measurements of complex permittivity in the microwave frequency range are well suited to the
identification of unknown materials. In the case of polar liquids, however, the dispersive nature makes it
difficult to identify the liquid even when the permittivity is known or measured. In this work, an SVM based
classification model is implemented in Python for the identification of six polar liquids (butan-1-ol,
DMSO, ethanediol, ethanol, methanol and propan-1-ol) for a temperature range of 10 °C-50 °C and a
frequency range of 0.1 GHz-5 GHz. The identification is done with a minimum number of parameters. The
input parameters are the complex permittivity, frequency and temperature, which are relevant to any
permittivity measurement system. The accuracy achieved is 100% for the specified temperature and frequency
range. The model is validated to confirm that it is not overfitted. The robustness of the
model is tested using an external dataset, on which all 40 samples are identified correctly. A GUI is
designed for the identification of unknown liquids. The probability of misclassification is also tested by manually
introducing measurement error in complex permittivity into the standard NPL data. The first
misclassification happens when the measured values of complex permittivity deviate by 6% from the standard
values in the NPL report. Any complex permittivity measurement technique with a measurement error of less than
±6% can be used with the proposed model to identify an unknown polar liquid in the temperature
range of 10 °C-50 °C and frequency range of 0.1 GHz-5 GHz without any misclassification. The response is
fast and the ambiguity in the identification process is eliminated. The model may be extended to identify
polar liquid mixtures by suitably extending the dataset.